Abstract

The points at which the log likelihood falls by $\tfrac12$ from its maximum value are often
used to give the 'errors' on a result, i.e. the 68% central confidence interval. The validity of
this is examined for two simple cases: a lifetime measurement and a Poisson measurement.

Results are compared with the exact Neyman construction and with the simple Bartlett 
approximation. It is shown that the accuracy of the log likelihood method is poor, and 
the Bartlett construction explains why it is flawed. 



1. Introduction 

In the limit where the number of measurements $N$ is large, the variance of the maximum
likelihood estimator $\hat a$ of a parameter $a$ is given by

$$V(\hat a) = -\left\langle \frac{d^2 \ln L}{da^2} \right\rangle^{-1} \tag{1}$$

and the quoted error $\sigma_a = \sqrt{V(\hat a)}$ can be read off the parabolic log likelihood curve from the
points at which $\ln L(a)$ falls by $\tfrac12$ from its peak value $\ln L(\hat a)$: $\Delta\ln L = -\tfrac12$.

For experiments with finite $N$ a similar procedure is in general use: the values $a_\pm$
below and above $\hat a$ for which $\Delta\ln L = \ln L(a_\pm) - \ln L(\hat a) = -\tfrac12$ are found, and the 68%
central confidence interval quoted as $[a_-, a_+]$ or $[\hat a - \sigma_-, \hat a + \sigma_+]$.

This is given a somewhat non-rigorous justification [1,2,3]: even though the log likelihood
curve for $a$ may not be a parabola, the parameter $a$ could be converted to some $a'$
for which the log likelihood curve is parabolic; symmetric errors $\sigma_{a'}$ could be read off in
the standard way, and the $a'$ interval converted back to the corresponding interval for $a$.
The invariance of the maximum likelihood formalism then ensures that this interval is just
the $\Delta\ln L = -\tfrac12$ interval for $a$.

This practice is now being questioned [4,5,6] and an examination of how well it actually 
works in practice is needed to inform this discussion. In this note we consider two typical 
cases where Maximum Likelihood estimation is used: the determination of the lifetime of 
an unstable state decaying according to the radioactive decay law, and the determination 
of the number of events produced by a Poisson process. In these we can determine the
interval produced by the $\Delta\ln L = -\tfrac12$ recipe and contrast it with the exact Neyman
interval. This is found [2,7] from the values $a_\pm$ satisfying:

$$\int_0^{\hat a} P(a'; a_+)\, da' = 0.16 \qquad \int_{\hat a}^{\infty} P(a'; a_-)\, da' = 0.16 \tag{2}$$



where $P(\hat a; a)$ is the probability density for a true value $a$ giving an estimate $\hat a$. These
equations define the confidence belt such that the probability of a measurement lying 
within the region is, by construction, 68%. 

An alternative approximation technique is that of Bartlett [1,7,8]. For any $N$ the
quantity $\frac{d\ln L}{da}$ is distributed with mean zero and variance $-\left\langle \frac{d^2 \ln L}{da^2} \right\rangle$. For large $N$ the
Central Limit Theorem prescribes that $\frac{d\ln L}{da} = \sum_i \frac{d \ln P(x_i, a)}{da}$, a sum of random
quantities, is Gaussian. If this quantity can be expressed in terms of $\hat a - \langle \hat a \rangle$ this can be
used to give confidence regions for $\hat a$. Further refinements can be used to correct for the
non-Gaussian finite $N$ behaviour, but these lie beyond the scope of this work.
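
The mean and variance quoted above follow from the normalisation of the probability density. A brief sketch of the standard argument, written for a single measurement $x$ with density $P(x; a)$ and assuming that differentiation with respect to $a$ can be taken inside the integral (for $N$ independent measurements the means and variances simply add):

```latex
\begin{align*}
\int P(x;a)\,dx = 1
  &\;\Rightarrow\; \int \frac{\partial P}{\partial a}\,dx
   = \int \frac{\partial \ln P}{\partial a}\,P\,dx
   = \left\langle \frac{\partial \ln P}{\partial a} \right\rangle = 0, \\
\frac{\partial}{\partial a}\int \frac{\partial \ln P}{\partial a}\,P\,dx = 0
  &\;\Rightarrow\;
   \left\langle \frac{\partial^2 \ln P}{\partial a^2} \right\rangle
   + \left\langle \left(\frac{\partial \ln P}{\partial a}\right)^{2} \right\rangle = 0 .
\end{align*}
```

Because the mean is zero, the second line states that the variance of the first derivative equals minus the expectation value of the second derivative.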

This note uses the 68% central confidence region for illustration, but the techniques 
can be applied to central or one-sided regions with any probability content. 

Bayesian statistics can also be used to give confidence intervals. This is an entirely 
different technique, and is not considered here. This study compares the exact Neyman
confidence intervals with two methods which claim to approximate to them. 
2. Lifetime Measurements

The probability for a state with mean lifetime $\tau$ to decay after an observed time $t$ is
given by

$$P(t; \tau) = \frac{1}{\tau}\, e^{-t/\tau} \tag{3}$$

The log likelihood for $N$ measurements $t_1 \ldots t_N$ is

$$\ln L = -N \frac{\bar t}{\tau} - N \ln \tau \tag{4}$$

where $\bar t = \frac{1}{N} \sum_i t_i$. Differentiation to find the maximum immediately gives $\hat\tau = \bar t$ and
$\ln L(\hat\tau) = -N(1 + \ln \bar t)$. The problem scales with $\tau/\bar t$, and without loss of generality we can
take $\bar t = 1$. We consider the 68% confidence region for various values of $N$.

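The probability of obtaining a particular value of $\bar t$ contains a term $e^{-N \bar t/\tau}$ from
Equation 3, and a factor $\bar t^{\,N-1}$ from the convolution. Normalisation gives (see [5], Equation 4)

$$P(\bar t; \tau) = \frac{N^N\, \bar t^{\,N-1}}{\tau^N\, (N-1)!}\, e^{-N \bar t/\tau} \tag{5}$$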

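For the exact Neyman region we require the integral of this quantity from zero to the
measured value, which is to be 16% for the upper limit $\tau_+ = \bar t + \sigma_+$ and 84% for the lower
limit $\tau_- = \bar t - \sigma_-$. This is given by

$$1 - e^{-N \bar t/\tau} \sum_{r=0}^{N-1} \frac{1}{r!} \left( \frac{N \bar t}{\tau} \right)^{r} \tag{6}$$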



The region thus obtained, expressed as differences from the measured $\bar t$ of 1, is shown
in columns 2 and 3 of Table 1, for values from $N = 1$ to $N = 25$.
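
As an illustration of how such limits can be obtained, the following is a minimal sketch that solves Equation 6 for $\tau_\pm$ by bisection. The function names, the bisection brackets and the default tail probability are assumptions made for this example: the tabulated values appear to correspond to a tail of about 0.1587 (the Gaussian one-sigma value) rather than the rounded 0.16 quoted in the text, which is an inference from reproducing the first row, not something the text states.

```python
import math

def mean_lifetime_cdf(tbar, tau, n):
    """P(mean of n decay times <= tbar) for true lifetime tau (Equation 6)."""
    x = n * tbar / tau
    term, total = 1.0, 1.0
    for r in range(1, n):
        term *= x / r          # builds x**r / r! iteratively
        total += term
    return 1.0 - math.exp(-x) * total

def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exact_lifetime_errors(n, tbar=1.0, tail=0.1587):
    """Return (sigma_minus, sigma_plus) from the Neyman construction.

    tail=0.1587 is an assumption (see the text above this block).
    """
    tau_plus = bisect(lambda tau: mean_lifetime_cdf(tbar, tau, n) - tail,
                      tbar, 1000.0 * tbar)
    tau_minus = bisect(lambda tau: mean_lifetime_cdf(tbar, tau, n) - (1.0 - tail),
                       1e-3 * tbar, tbar)
    return tbar - tau_minus, tau_plus - tbar
```

Calling `exact_lifetime_errors(1)` should give values close to the first row of the Exact columns; passing `tail=0.16` instead shows how sensitive the third decimal place is to the rounding.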



 N        Exact           ΔlnL = -1/2        Bartlett
        σ-      σ+        σ-      σ+        σ-      σ+
 1    0.457   4.787     0.576   2.314     0.500     ∞
 2    0.394   1.824     0.469   1.228     0.414   2.414
 3    0.353   1.194     0.410   0.894     0.366   1.366
 4    0.324   0.918     0.370   0.725     0.333   1.000
 5    0.302   0.760     0.340   0.621     0.309   0.809
 6    0.284   0.657     0.318   0.550     0.290   0.690
 7    0.270   0.584     0.299   0.497     0.274   0.608
 8    0.257   0.529     0.284   0.456     0.261   0.547
 9    0.247   0.486     0.271   0.423     0.250   0.500
10    0.237   0.451     0.260   0.396     0.240   0.463
15    0.203   0.343     0.219   0.310     0.205   0.348
20    0.182   0.285     0.194   0.261     0.183   0.288
25    0.166   0.248     0.176   0.230     0.167   0.250



Table 1: 68% Confidence regions obtained by the 3 methods for a lifetime measurement 
The $\Delta\ln L = -\tfrac12$ points can be found numerically from Equation 4. These are shown
in columns 4 and 5 of Table 1.
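
One way to find these points numerically is sketched below. It uses Equation 4 with the maximum at $\hat\tau = \bar t$ and a plain bisection on each side of the peak; the function names and the search brackets are choices made for this example only.

```python
import math

def delta_lnL_lifetime(tau, n, tbar=1.0):
    """ln L(tau) - ln L(tau_hat) from Equation 4, with tau_hat = tbar."""
    return -n * (tbar / tau + math.log(tau / tbar) - 1.0)

def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lnL_lifetime_errors(n, tbar=1.0):
    """Return (sigma_minus, sigma_plus) where ln L has fallen by 1/2."""
    f = lambda tau: delta_lnL_lifetime(tau, n, tbar) + 0.5
    tau_minus = bisect(f, 1e-6 * tbar, tbar)   # below the peak
    tau_plus = bisect(f, tbar, 1e3 * tbar)     # above the peak
    return tbar - tau_minus, tau_plus - tbar
```

For `n = 1` this should return values close to 0.576 and 2.314, the first row of the corresponding columns.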

For the Bartlett approximation, the differential of Equation 4 gives $\frac{N}{\tau^2}(\bar t - \tau)$, and the
expectation value of the second differential gives the variance of this as $N/\tau^2$. Thus for a
given $\tau$ the probability distribution for $\bar t$ has mean $\tau$ and standard deviation $\tau/\sqrt N$. This
is exact. We then (this is the approximation) take this as being Gaussian and use it
in the Neyman prescription, accordingly requiring that $\bar t$ lie one standard deviation above
$\tau_- = \bar t - \sigma_-$ and one standard deviation below $\tau_+ = \bar t + \sigma_+$:



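$$\bar t = \tau_- + \frac{\tau_-}{\sqrt N} \qquad \bar t = \tau_+ - \frac{\tau_+}{\sqrt N} \tag{7}$$

i.e. $\sigma_- = \frac{\bar t}{\sqrt N + 1}$ and $\sigma_+ = \frac{\bar t}{\sqrt N - 1}$. These are shown in the final two columns of Table 1.

The results are also presented graphically in Figure 1.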
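
The closed forms that follow from Equation 7 are simple enough to evaluate directly. A short sketch (the function name is hypothetical) that also handles the $N = 1$ case, where the upper limit is unbounded:

```python
import math

def bartlett_lifetime_errors(n, tbar=1.0):
    """(sigma_minus, sigma_plus) from solving Equation 7 for tau_-, tau_+."""
    root_n = math.sqrt(n)
    sigma_minus = tbar / (root_n + 1.0)
    # For n = 1 the denominator vanishes: no finite upper limit exists.
    sigma_plus = math.inf if n == 1 else tbar / (root_n - 1.0)
    return sigma_minus, sigma_plus
```

For example, `bartlett_lifetime_errors(4)` gives one third and one, matching the $N = 4$ row of the Bartlett columns.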




Figure 1: Upper and lower limits on the 68% central confidence interval for a lifetime
measurement, showing the exact construction (red), the Bartlett approximation (blue) and
the $\Delta\ln L$ approximation (green)



Two points emerge, from both Table 1 and Figure 1. One is that the Bartlett approximation
does surprisingly well (except at very small $N$, of order 1). The second is that the
log likelihood approximation does surprisingly badly. For $N \sim 10$ the differences are of
order 10%. The convergence towards agreement is clearly slow.
3. Poisson Measurements

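If $N$ events are seen from a Poisson process, Equation 2 gives the upper and lower
limits of the 68% central region as

$$\sum_{r=0}^{N} e^{-\lambda_+} \frac{\lambda_+^{\,r}}{r!} = 0.16 \qquad \sum_{r=0}^{N-1} e^{-\lambda_-} \frac{\lambda_-^{\,r}}{r!} = 0.84 \tag{8}$$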

These are shown in columns 2 and 3 of Table 2 for a range of values of $N$. The $\Delta\ln L = -\tfrac12$
errors are read off $N - \lambda + N \ln(\lambda/N)$, the change in $\ln L$ from its maximum at $\lambda = N$. These are shown in columns 4 and 5 of Table 2.
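
Both sets of Poisson limits can be computed with a few lines of code. The sketch below solves Equation 8 and the $\Delta\ln L = -\tfrac12$ condition by bisection. The function names and brackets are illustrative, and the default tail of 0.1587 is an assumption: it appears to reproduce the tabulated values more closely than the rounded 0.16 and 0.84 written in the text.

```python
import math

def poisson_cdf(k, lam):
    """P(n <= k) for a Poisson distribution with mean lam."""
    term = total = math.exp(-lam)
    for r in range(1, k + 1):
        term *= lam / r
        total += term
    return total

def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exact_poisson_errors(n, tail=0.1587):
    """(sigma_minus, sigma_plus) from Equation 8; tail value is an assumption."""
    hi = n + 10.0 * math.sqrt(n) + 20.0          # generous upper bracket
    lam_plus = bisect(lambda lam: poisson_cdf(n, lam) - tail, float(n), hi)
    lam_minus = bisect(lambda lam: poisson_cdf(n - 1, lam) - (1.0 - tail),
                       1e-9, float(n))
    return n - lam_minus, lam_plus - n

def lnL_poisson_errors(n):
    """(sigma_minus, sigma_plus) where N - lam + N ln(lam/N) = -1/2."""
    f = lambda lam: n - lam + n * math.log(lam / n) + 0.5
    hi = n + 10.0 * math.sqrt(n) + 20.0
    lam_minus = bisect(f, 1e-9, float(n))
    lam_plus = bisect(f, float(n), hi)
    return n - lam_minus, lam_plus - n
```

For `n = 1` the two functions should return values close to (0.827, 2.299) and (0.698, 1.358) respectively.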



 N        Exact           ΔlnL = -1/2        Bartlett
        σ-      σ+        σ-      σ+        σ-      σ+
 1    0.827   2.299     0.698   1.358     1.118   2.118
 2    1.292   2.637     1.102   1.765     1.500   2.500
 3    1.633   2.918     1.416   2.080     1.803   2.803
 4    1.914   3.162     1.682   2.346     2.062   3.062
 5    2.159   3.382     1.916   2.581     2.291   3.291
 6    2.380   3.583     2.128   2.794     2.500   3.500
 7    2.581   3.770     2.323   2.989     2.693   3.693
 8    2.768   3.944     2.505   3.171     2.872   3.872
 9    2.943   4.110     2.676   3.342     3.041   4.041
10    3.108   4.266     2.838   3.504     3.202   4.202
15    3.829   4.958     3.547   4.213     3.905   4.905
20    4.434   5.546     4.145   4.811     4.500   5.500
25    4.966   6.066     4.672   5.339     5.025   6.025



Table 2: 68% Confidence regions obtained by the 3 methods for a Poisson measurement 

The Bartlett method gives the familiar fact that the variance of $n - \lambda$ is just $\lambda$. This
suggests that

$$n - \lambda_- = \sqrt{\lambda_-} \qquad \lambda_+ - n = \sqrt{\lambda_+}.$$

However $P(n; \lambda)$ is defined for integer $n$ only. To make this set of discrete spikes look like a
Gaussian requires us to replace it by a histogram where the value is defined as $e^{-\lambda} \lambda^n / n!$
for values of the continuous abscissa variable between $n - \tfrac12$ and $n + \tfrac12$. This requires us
to add $\tfrac12$ to each of the ranges, giving

$$\sigma_- = \sqrt{n + \tfrac14} \qquad \sigma_+ = \sqrt{n + \tfrac14} + 1 \tag{9}$$

These are shown in columns 6 and 7 of Table 2. The data are shown graphically in Figure 
2. 
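
The step from the two square-root conditions to Equation 9 is a pair of quadratics in $\sqrt{\lambda_\mp}$. A short worked version, intended only as a sketch of the algebra with $n$ the observed number of events:

```latex
\begin{align*}
\lambda_- + \sqrt{\lambda_-} - n = 0
  &\;\Rightarrow\; \sqrt{\lambda_-} = \sqrt{n + \tfrac14} - \tfrac12 ,
  & n - \lambda_- &= \sqrt{n + \tfrac14} - \tfrac12 , \\
\lambda_+ - \sqrt{\lambda_+} - n = 0
  &\;\Rightarrow\; \sqrt{\lambda_+} = \sqrt{n + \tfrac14} + \tfrac12 ,
  & \lambda_+ - n &= \sqrt{n + \tfrac14} + \tfrac12 .
\end{align*}
```

Adding $\tfrac12$ to each range then gives $\sqrt{n + \tfrac14}$ and $\sqrt{n + \tfrac14} + 1$. For $n = 6$ these are 2.5 and 3.5, as in the corresponding row of Table 2.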
Figure 2: Upper and lower limits on the 68% central confidence interval for a Poisson
measurement, showing the exact construction (red), the Bartlett approximation (blue) and
the $\Delta\ln L$ approximation (green)



Again, the Bartlett approximation does surprisingly well, and the $\Delta\ln L$ approximation
surprisingly badly. Furthermore, in this case it underestimates both errors, which will 
inevitably lead to a smaller than desired coverage. (This could be remedied by adding 
0.5 to each limit, to account for the discrete binning, though this is still worse than the 
Bartlett approximation, as can be seen from Table 2.) 

4. Summary 

The poor behaviour of the log likelihood error approximation can be understood within
the Bartlett approximation. The distribution for $\frac{d\ln L}{da}$ is re-expressed in terms of a
distribution for $\hat a - a$ which is assumed to be Gaussian

$$P(\hat a; a) = \frac{1}{\sqrt{2\pi}\, \sigma(a)}\, e^{-(\hat a - a)^2 / 2\sigma(a)^2} \tag{10}$$



where the notation $\sigma(a)$ makes the point that the variance of this Gaussian depends on $a$.

The 68% limits are given by finding the $a$ for which $\hat a - a = \pm\sigma(a)$. These do indeed
correspond to a fall of $\tfrac12$ in the log likelihood from the exponential. However the total
log likelihood also changes with $a$ due to the $-\ln\sigma(a)$ from the denominator. The simple
$\Delta\ln L = -\tfrac12$ method considers all factors together, and thus wrongly includes this term.
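
The lifetime case gives a concrete illustration. Within the Gaussian approximation $\sigma(\tau) = \tau/\sqrt N$, so the logarithm of the approximate density can be written out explicitly. This is a sketch of the argument above rather than an additional result, and comparing with the value at $\tau = \bar t$ neglects the small shift of the peak of the approximate function.

```latex
\begin{align*}
\ln P(\bar t;\tau) &= -\frac{N(\bar t-\tau)^2}{2\tau^2} - \ln\tau + \text{const}, \\
\ln P(\bar t;\tau_\pm) - \ln P(\bar t;\bar t)
  &= -\tfrac12 - \ln\frac{\tau_\pm}{\bar t}
  \qquad\text{where } \bar t-\tau_\pm = \mp\,\frac{\tau_\pm}{\sqrt N} .
\end{align*}
```

For $\tau_+ > \bar t$ the extra term is negative, so the total has already fallen by more than $\tfrac12$ and the $\Delta\ln L = -\tfrac12$ point lies inside $\tau_+$. For $\tau_- < \bar t$ it is positive and the point lies outside $\tau_-$. This matches the pattern of Table 1, where the $\Delta\ln L$ method gives a smaller $\sigma_+$ and a larger $\sigma_-$ than the other two methods.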
The inaccuracy of the logarithmic method is appreciable. For reasonable values of $N$
it is generally wrong in the second significant figure, and often pretty grossly wrong. That
this occurs for both cases examined suggests that this is true in general. And yet values 
obtained by this method are frequently quoted to considerable precision by experiments. 

In the complicated likelihood functions used in real experimental results, a simple 
Bartlett approach may not be possible. However the logarithmic approximation clearly 
does not provide the accuracy with which experiments wish to report their results. An 
alternative, available today but not in the 1950s when these techniques were developed,
is to use the known likelihood function to perform the Neyman construction using Monte
Carlo integration (the so-called 'toy Monte Carlo'). This should be strongly recommended.
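
A minimal sketch of what such a toy Monte Carlo might look like for the lifetime measurement is given below. It is an illustration under stated assumptions, not a prescription: the function names, the number of toys, the fixed seed, the search brackets and the tail probability of 0.1587 are all choices made for this example. The same seed is reused for every trial value of $\tau$ so that the estimated fraction varies monotonically with $\tau$ and a simple bisection can be used.

```python
import random

def toy_fraction_below(tau, t_obs, n, n_toys, seed):
    """Fraction of toy experiments whose mean decay time is <= t_obs."""
    rng = random.Random(seed)   # same seed for every tau (common random numbers)
    below = 0
    for _ in range(n_toys):
        mean = sum(rng.expovariate(1.0 / tau) for _ in range(n)) / n
        if mean <= t_obs:
            below += 1
    return below / n_toys

def solve_decreasing(g, target, lo, hi, iters=25):
    """Bisect for g(tau) = target, assuming g decreases with tau on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > target:
            lo = mid            # fraction still too large: need a larger tau
        else:
            hi = mid
    return 0.5 * (lo + hi)

def toy_neyman_lifetime(t_obs, n, n_toys=5000, tail=0.1587, seed=12345):
    """(sigma_minus, sigma_plus) from a toy Monte Carlo Neyman construction."""
    frac = lambda tau: toy_fraction_below(tau, t_obs, n, n_toys, seed)
    tau_plus = solve_decreasing(frac, tail, t_obs, 50.0 * t_obs)
    tau_minus = solve_decreasing(frac, 1.0 - tail, t_obs / 50.0, t_obs)
    return t_obs - tau_minus, tau_plus - t_obs
```

With `t_obs = 1.0` and `n = 5` the result should agree with the Exact columns of Table 1 to within the statistical precision of the toys, which is a few percent for 5000 toys. For a real analysis the exponential generator would be replaced by a simulation of the full likelihood model.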